51 research outputs found

    Brief Announcement: The Fault-Tolerant Cluster-Sending Problem

    The development of fault-tolerant distributed systems that can tolerate Byzantine behavior has traditionally focused on consensus protocols, which support fully-replicated designs. Developing more sophisticated high-performance Byzantine distributed systems, however, requires more specialized fault-tolerant communication primitives. In this brief announcement, we identify the cluster-sending problem (the problem of sending a message from one Byzantine cluster to another Byzantine cluster in a reliable manner) as such an essential communication primitive. We not only formalize this fundamental problem, but also establish lower bounds on its complexity under crash failures and Byzantine failures. Furthermore, we develop practical cluster-sending protocols that meet these lower bounds and, hence, have optimal complexity. As such, our work provides a strong foundation for the further exploration of novel designs that address challenges encountered in fault-tolerant distributed systems.

    Coordination-Free Byzantine Replication with Minimal Communication Costs

    State-of-the-art fault-tolerant and federated data management systems rely on fully-replicated designs in which all participants have equivalent roles. Consequently, these systems have only limited scalability and are ill-suited for high-performance data management. As an alternative, we propose a hierarchical design in which a Byzantine cluster manages data, while an arbitrary number of learners can reliably learn these updates and use the corresponding data. To realize our design, we propose the delayed-replication algorithm, an efficient solution to the Byzantine learner problem that is central to our design. The delayed-replication algorithm is coordination-free, scalable, and has minimal communication cost for all participants involved. In doing so, the delayed-replication algorithm opens the door to new high-performance fault-tolerant and federated data management systems. To illustrate this, we show that the delayed-replication algorithm is not only useful to support specialized learners, but can also be used to reduce the overall communication cost of permissioned blockchains and to improve their storage scalability.

    Brief Announcement: Revisiting Consensus Protocols through Wait-Free Parallelization

    In this brief announcement, we propose a protocol-agnostic approach to improve the design of primary-backup consensus protocols. At the core of our approach is a novel wait-free design for running several instances of the underlying consensus protocol in parallel. To yield a high-performance parallelized design, we present coordination-free techniques to order operations across parallel instances, deal with instance failures, and assign clients to specific instances. Consequently, the design we present is able to reduce the load on individual instances and primaries, while also reducing the adverse effects of any malicious replicas. Our design is fine-tuned such that the instances coordinated by non-faulty replicas are wait-free: they can continuously make consensus decisions, independent of the behavior of any other instances.

    Practical View-Change-Less Protocol through Rapid View Synchronization

    The emergence of blockchain technology has renewed the interest in consensus-based data management systems that are resilient to failures. To maximize the throughput of these systems, we have recently seen several prototype consensus solutions that optimize for throughput at the cost of increased implementation complexity, higher costs, and reduced reliability. Due to this, it remains unclear how these prototypes will perform in real-world environments. In this paper, we present the Practical View-Change-Less Protocol (PVP), a high-throughput, simple, and reliable consensus protocol. Central to PVP is the combination of (1) a chained consensus design for replicating requests with a reduced message cost; (2) the novel Rapid View Synchronization protocol that enables robust and low-cost failure recovery; and (3) a high-performance concurrent consensus architecture in which independent instances of the chained consensus operate concurrently to process requests with high throughput and without single-replica bottlenecks. Due to the concurrent consensus architecture, PVP greatly outperforms traditional primary-backup consensus protocols such as PBFT (by up to 430%), Narwhal (by up to 296%), and HotStuff (by up to 3803%). Due to its reduced message cost, PVP is even able to outperform RCC, a state-of-the-art high-throughput concurrent consensus protocol, by up to 23%. Furthermore, PVP is able to maintain a stable and low latency and consistently high throughput even during failures. Comment: 16 pages, 14 figures